Geo-Spatial Analysis on UFO sighting data

Experiment
Geo-Spatial
Author

F.L

Published

June 8, 2024

Introduction

I first encounter UFO for my master degree. Not literally. I need a project for my “Data Viz” module. But back then my geo-spatial analysis skills was limited.

This little project will look over these data an attempt to answer questions such as:

  • Where can I find aliens?
  • What does a UFO typically looks like (based on description)?
  • Has alien left us since the 1980s? (temporal patterns)

UFO on map

The Data

Data includes both logitude location and times. So can do both time series or spatial viz.  Source: https://www.kaggle.com/NUFORC/ufo-sightings

The complete data includes entries where the location of the sighting was not found or blank (0.8146%) or have an erroneous or blank time (8.0237%). Since the reports date back to the 20th century, some older data might be obscured. Data contains city, state, time, description, and duration of each sighting.

library(tidyverse)
library(sf)
library(leaflet)
## read data and clean date
ufo <- read_csv('./complete.csv') |> 
  janitor::clean_names() |> 
   mutate(
     datetime = as_datetime(datetime, format = "%m/%e/%Y %R")
    ,date_posted = as_date(date_posted, format="%m/%e/%Y"))
Warning: One or more parsing issues, call `problems()` on your data frame for details,
e.g.:
  dat <- vroom(...)
  problems(dat)
Warning: There was 1 warning in `mutate()`.
ℹ In argument: `date_posted = as_date(date_posted, format = "%m/%e/%Y")`.
Caused by warning:
!  196 failed to parse.
## convert to sf object
ufo_points = ufo |> 
  filter(!is.na(latitude) & !is.na(longitude)) |> 
  filter(!(latitude == 0 & longitude == 0)) |> 
  drop_na() |> 
  sf::st_as_sf(coords=c("longitude","latitude"),crs=4326)

# df_scr <- read_csv('./scrubbed.csv') |> janitor::clean_names()

## Below used to analysis how to format date
# ## analysis to check which one is digit which one is date, which one is month
# df_com |> 
#   select(datetime) |> 
#   mutate(
#     md1 = str_extract(datetime, "^\\d{1,2}")
#     ,md2 = str_extract(datetime, "(?<=[\\d]{2}\\/)\\d{1,2}")
#   ) |> 
#   distinct(md1,md2)
# ## first digit is month, second is date
# stopifnot(0!=df_com |> 
#   select(datetime) |> 
#   mutate(datetime_ = as_datetime(datetime, format="%m/%e/%Y %R")) |> 
#   filter(day(datetime_) > 12) |> 
#   nrow())
# ## the format you want to extract is "%m/%e/%Y %R" use lubridate dattiem

Grand Map View

Only a proportion of data are kept.

[1] "The propotion of data visualised is: 0.78"
[1] "1910-01-02 UTC" "2014-05-08 UTC"

This is all the data we can find

create UFO icon map - make icon
## over all scale using 
ufo_icon = makeIcon(iconUrl = "./asset/ufo.png",
                    iconWidth = 25, iconHeight = 25)

## a javascript function which creates clusters
clus_func_js = function() {
  JS("  
    function(cluster) {  
       return new L.DivIcon({  
         html: '<div style=\"background-color:rgba(77,77,77,0.05)\"><span>' + cluster.getChildCount() + '</div><span>',  
         className: 'marker-cluster'  
       });  
       }")
}

addUFOMarker = function(x) {
  addMarkers(
    x,
    # ~longitude, ~latitude,
    icon = ufo_icon,  
    clusterOptions = markerClusterOptions(  
      iconCreateFunction = clus_func_js()
    ),  
    label = ~ label,
    popup = ~comments_,
  )
}
create UFO icon map - heatmap
## access polygon data
country_polygons = spData::world |> 
  filter(iso_a2 %in% c("GB","US","CA"))

## make grid on top of the layer
mesh=sf::st_make_grid(
     country_polygons
    ,cellsize = c(0.5,0.5)
    ,square = T
  )
## intersect points 
ids = st_intersects(mesh,ufo_points)
which_non_zeros = which(lengths(ids) != 0)
pixels =st_as_sf(x=mesh[which_non_zeros])
pixels$n = lengths(ids)[which_non_zeros]
create UFO icon map - the map
## leaflet color palete
pal <- colorNumeric(
  palette = "PuBu",
  domain = pixels$n)

## use base layer of map
## set view to USA
uk_lon <- -95.71 #W
uk_lat <- 37.09 #N
base_map = ufo_points |> 
  mutate(
     label = paste(city,state,country,shape)
    ,comments_ = paste0(datetime,": ", comments)
  ) |> 
  leaflet()  |> 
  setView(lng = uk_lon, lat = uk_lat, zoom =4) |> 
  addProviderTiles(providers$CartoDB.Positron)

## map
base_map |> 
  addPolygons(data=pixels,stroke=F,fillColor = ~pal(pixels$n), fillOpacity = 0.8) |> 
  addUFOMarker()## leaflet color palete

If you explore this map, you will find there are barely any record (or at least on this map) on UK. The majority of sighting concentrate on US. There are also duplicated sightings…

Clustered Perhaps many people were spotting the same event.